Multi-Relational Data Mining using Probabilistic Models Research Summary
نویسنده
چکیده
We are often faced with the challenge of mining data represented in relational form. Unfortunately, most statistical learning methods work only with “flat” data representations. Thus, to apply these methods, we are forced to convert the data into a flat form, thereby not only losing its compact representation and structure but also potentially introducing statistical skew. These drawbacks severely limit the ability of current statistical methods to mine relational databases. Probabilistic models, in particular probabilistic relational models, allow us to represent a statistical model over a relational domain. These models can represent correlations between attributes within a single table, and between attributes in multiple tables, when these tables are related via foreign key joins. In previous work [4, 6, 8], we have developed algorithms for automatically constructing a probabilistic relational model directly from a relational database. We survey the results here and describe how the methods can be used to discover interesting dependencies the data. We show how this class of models and our construction algorithm are ideally suited to mining multi-relational data.
منابع مشابه
Learning Multi-Relational Semantics Using Neural-Embedding Models
Real-world entities (e.g., people and places) are often connected via relations, forming multirelational data. Modeling multi-relational data is important in many research areas, from natural language processing to biological data mining [6]. Prior work on multi-relational learning can be categorized into three categories: (1) statistical relational learning (SRL) [10], such as Markovlogic netw...
متن کاملRelational Data Mining Using Probabilistic Relational Models
This thesis documents the design, implementation and test of Probabilistic Relational Models (PRMs). PRMs are a graphical statistical approach to modeling relational data using the Relational Language. PRMs consist of two components; the dependency structure and the parameters. Our design is based on simplicity, flexibility , and performance. We explain the search over possible structures, usin...
متن کاملMulti-relational Data Mining in Medical Databases
This paper presents the application of a method for mining data in a multi-relational database that contains some information about patients strucked down by chronic hepatitis. Our approach may be used on any kind of multirelational database and aims at extracting probabilistic tree patterns from a database using Grammatical Inference techniques. We propose to use a representation of the databa...
متن کاملProbabilistic Relational Model Benchmark Generation
The validation of any database mining methodology goes through an evaluation process where benchmarks availability is essential. In this paper, we aim to randomly generate relational database benchmarks that allow to check probabilistic dependencies among the attributes. We are particularly interested in Probabilistic Relational Models (PRMs), which extend Bayesian Networks (BNs) to a relationa...
متن کاملAn efficient approach for effectual mining of relational patterns from multi-relational database
Data mining is an extremely challenging and hopeful research topic due to its well-built application potential and the broad accessibility of the massive quantities of data in databases. Still, the rising significance of data mining in practical real world necessitates ever more complicated solutions while data includes of a huge amount of records which may be stored in various tables of a rela...
متن کامل